UniSite: The First Cross-Structure Dataset and Learning Framework for End-to-End Ligand Binding Site Detection

arXiv.org Artificial Intelligence

The detection of ligand binding sites for proteins is a fundamental step in Structure-Based Drug Design. Despite notable advances in recent years, existing methods, datasets, and evaluation metrics face several key challenges: (1) current datasets and methods are centered on individual protein-ligand complexes and neglect that diverse binding sites may exist across multiple complexes of the same protein, introducing significant statistical bias; (2) ligand binding site detection is typically modeled as a discontinuous workflow, employing binary segmentation and subsequent clustering algorithms; (3) traditional evaluation metrics do not adequately reflect the actual performance of different binding site prediction methods. To address these issues, we first introduce UniSite-DS, the first UniProt (Unique Protein)-centric ligand binding site dataset, which contains 4.81 times more multi-site data and 2.08 times more overall data than the previously most widely used datasets. We then propose UniSite, the first end-to-end ligand binding site detection framework supervised by set prediction loss with bijective matching. In addition, we introduce Average Precision based on Intersection over Union (IoU) as a more accurate evaluation metric for ligand binding site prediction. Extensive experiments on UniSite-DS and several representative benchmark datasets demonstrate that IoU-based Average Precision provides a more accurate reflection of prediction quality, and that UniSite outperforms current state-of-the-art methods in ligand binding site detection. The dataset and code will be made publicly available at https://github.com/quanlin-wu/unisite.
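To make the IoU-based Average Precision idea concrete, here is a minimal sketch. It assumes each binding site is represented as a set of residue indices and that AP is computed in the usual object-detection style (predictions ranked by confidence, each matched to at most one ground-truth site). The abstract does not spell out the exact definition, so the function names, the 0.5 threshold, and the matching rule are illustrative assumptions rather than the authors' implementation.

```python
from typing import List, Set, Tuple


def site_iou(a: Set[int], b: Set[int]) -> float:
    """IoU of two binding sites, each given as a set of residue indices (assumed representation)."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def average_precision(
    preds: List[Tuple[float, Set[int]]],  # (confidence, predicted site)
    gts: List[Set[int]],                  # ground-truth sites of one protein
    iou_thr: float = 0.5,                 # hypothetical threshold
) -> float:
    """Non-interpolated AP: mean of the precision values at each true positive."""
    if not gts:
        return 0.0
    matched = [False] * len(gts)
    true_positives = 0
    precision_sum = 0.0
    ranked = sorted(preds, key=lambda p: p[0], reverse=True)
    for rank, (_, site) in enumerate(ranked, start=1):
        # Find the best still-unmatched ground-truth site for this prediction.
        best_iou, best_j = 0.0, -1
        for j, gt in enumerate(gts):
            if matched[j]:
                continue
            iou = site_iou(site, gt)
            if iou > best_iou:
                best_iou, best_j = iou, j
        if best_j >= 0 and best_iou >= iou_thr:
            matched[best_j] = True
            true_positives += 1
            precision_sum += true_positives / rank
    return precision_sum / len(gts)


gts = [{1, 2, 3, 4}, {10, 11, 12}]
preds = [(0.9, {1, 2, 3}), (0.8, {20, 21}), (0.7, {10, 11, 12})]
ap = average_precision(preds, gts)
```

In this toy case the first and third predictions are true positives and the second is a false positive, so a ranking-aware metric penalizes the spurious site in a way that a simple hit/miss count would not.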


Robust Pan-Cancer Mitotic Figure Detection with YOLOv12

arXiv.org Artificial Intelligence

Detecting mitotic figures (MFs) in histopathology images remains a challenging task. Their quantification traditionally relies on the manual identification of "hot spots" by pathologists, followed by visual counting, an approach that is inherently subjective and may not reliably reflect the true proliferative activity of a tumor. With the rise of digital pathology and artificial intelligence, numerous efforts have been made to automate mitosis detection in order to enhance accuracy, reproducibility, and scalability. Among these, the MItosis DOmain Generalization (MIDOG) challenges have emerged as a key benchmark for evaluating the generalizability of detection algorithms under realistic domain shifts. The 2021 edition (1) addressed scanner-induced variability using breast cancer WSIs, while the 2022 edition (2) extended the scope to include multiple tissue types and species, introducing further biological diversity. The 2025 MIDOG challenge (3) builds on these foundations with the most comprehensive mitosis-annotated dataset to date, and introduces two tasks: (1) detecting mitotic figures in arbitrary tumor tissue, and (2) determining whether a mitotic figure is atypical or normal. These tasks represent a significant step toward developing robust mitosis detection systems that generalize across diverse and complex histological conditions. In this work, we present a high-performance detection pipeline based on the YOLOv12 object detection architecture.


Accelerating Non-Maximum Suppression: A Graph Theory Perspective

King-Siong Si, Lu Sun

Neural Information Processing Systems

Non-maximum suppression (NMS) is an indispensable post-processing step in object detection. With the continuous optimization of network models, NMS has become the "last mile" to enhance the efficiency of object detection.



3eb65004054f5d21fca4087f5658c727-AuthorFeedback.pdf

Neural Information Processing Systems

Thanks for the insightful and helpful reviews, which will significantly improve our paper. R1, R2, R3 indicate to whom the concern belongs. Ground truth is in red, predictions are in blue, and predicted eye gaze point of the gaze-based model is in green. SVMo bridges global and local context both spatially (e.g., whole frame vs anchor [...] Other contributions include exhaustive experiments which may be useful for future studies. In (c), the predicted gaze falls on the intersection of 3 objects, slightly closer to the center of the rabbit.


A Segmented Robot Grasping Perception Neural Network for Edge AI

arXiv.org Artificial Intelligence

Robotic grasping, the ability of robots to reliably secure and manipulate objects of varying shapes, sizes and orientations, is a complex task that requires precise perception and control. Deep neural networks have shown remarkable success in grasp synthesis by learning rich and abstract representations of objects. When deployed at the edge, these models can enable low-latency, low-power inference, making real-time grasping feasible in resource-constrained environments. This work implements Heatmap-Guided Grasp Detection, an end-to-end framework for the detection of 6-DoF grasp poses, on the GAP9 RISC-V System-on-Chip. The model is optimised using hardware-aware techniques, including input dimensionality reduction, model partitioning, and quantisation. Object grasping synthesis is a fundamental challenge in robotics, underpinning applications such as automated warehouse operations, patient assistance in healthcare, and object sorting on assembly lines [1]. While humans excel at grasping objects of various shapes and sizes with precision, replicating this ability in robotics remains challenging.
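As a rough illustration of what quantisation involves, the sketch below applies symmetric per-tensor int8 quantisation to a list of weights. This is a generic textbook scheme, not the quantisation flow used for the GAP9 deployment, which the abstract does not describe; the function names and the choice of the [-127, 127] range are assumptions made for the example.

```python
from typing import List, Tuple


def quantise_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantisation: map floats to integers in [-127, 127]."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Guard against an all-zero (or empty) tensor, where any scale works.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantise(q: List[int], scale: float) -> List[float]:
    """Recover approximate float values from the int8 codes."""
    return [v * scale for v in q]


weights = [1.27, -0.64, 0.0, 0.3]
codes, scale = quantise_int8(weights)
restored = dequantise(codes, scale)
```

Each restored value differs from the original by at most half a quantisation step (`scale / 2`), which is the accuracy cost paid for storing 8-bit integers instead of 32-bit floats.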


Diffusion Suction Grasping with Large-Scale Parcel Dataset

arXiv.org Artificial Intelligence

While recent advances in object suction grasping have shown remarkable progress, significant challenges persist, particularly in cluttered and complex parcel handling scenarios. Two fundamental limitations hinder current approaches: (1) the lack of a comprehensive suction grasp dataset tailored for parcel manipulation tasks, and (2) insufficient adaptability to diverse object characteristics including size variations, geometric complexity, and textural diversity. To address these challenges, we present Parcel-Suction-Dataset, a large-scale synthetic dataset containing 25 thousand cluttered scenes with 410 million precision-annotated suction grasp poses. This dataset is generated through our novel geometric sampling algorithm, which enables efficient generation of optimal suction grasps incorporating both physical constraints and material properties. We further propose Diffusion-Suction, an innovative framework that reformulates suction grasp prediction as a conditional generation task through denoising diffusion probabilistic models. Our method iteratively refines random noise into suction grasp score maps through visual-conditioned guidance from point cloud observations, effectively learning spatial point-wise affordances from our synthetic dataset. Extensive experiments demonstrate that the simple yet efficient Diffusion-Suction achieves new state-of-the-art performance compared to previous models on both Parcel-Suction-Dataset and the public SuctionNet-1Billion benchmark.


Accelerating Non-Maximum Suppression: A Graph Theory Perspective

arXiv.org Artificial Intelligence

Non-maximum suppression (NMS) is an indispensable post-processing step in object detection. With the continuous optimization of network models, NMS has become the "last mile" to enhance the efficiency of object detection. This paper systematically analyzes NMS from a graph theory perspective for the first time, revealing its intrinsic structure. Consequently, we propose two optimization methods, namely QSI-NMS and BOE-NMS. The former is a fast recursive divide-and-conquer algorithm with negligible mAP loss, and its extended version (eQSI-NMS) achieves optimal complexity of $\mathcal{O}(n\log n)$. The latter, concentrating on the locality of NMS, achieves a constant-factor speedup with no mAP loss. Moreover, to facilitate rapid evaluation of NMS methods for researchers, we introduce NMS-Bench, the first benchmark designed to comprehensively assess various NMS methods. Taking the YOLOv8-N model on MS COCO 2017 as the benchmark setup, QSI-NMS runs at $6.2\times$ the speed of the original NMS, with a $0.1\%$ decrease in mAP. The optimal eQSI-NMS, with only a $0.3\%$ mAP decrease, achieves $10.7\times$ the speed. Meanwhile, BOE-NMS reaches $5.1\times$ the speed with no compromise in mAP.
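For context, the baseline that these methods accelerate can be sketched as follows. This is the classic greedy NMS with quadratic worst-case cost, not QSI-NMS, eQSI-NMS, or BOE-NMS, whose details the abstract does not give. The `(x1, y1, x2, y2)` box format, the function names, and the 0.5 threshold are assumptions for illustration.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed corner format


def box_iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes: List[Box], scores: List[float], iou_thr: float = 0.5) -> List[int]:
    """Return indices of kept boxes, highest score first.

    A box is kept only if it does not overlap any already-kept,
    higher-scoring box by more than iou_thr. Worst case is O(n^2) IoU checks.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(box_iou(boxes[i], boxes[k]) <= iou_thr for k in keep):
            keep.append(i)
    return keep


boxes = [(0.0, 0.0, 10.0, 10.0), (1.0, 1.0, 11.0, 11.0), (20.0, 20.0, 30.0, 30.0)]
scores = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, scores)
```

Here the second box overlaps the first with an IoU of about 0.68 and is suppressed, while the distant third box survives. The pairwise IoU checks in the inner loop are the cost that the reported speedups are measured against.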